Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 38471 |
| Missing cells | 67074 |
| Missing cells (%) | 7.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.3 MiB |
| Average record size in memory | 200.0 B |
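The missing-cell percentage in the table above can be recomputed from the other figures as a consistency check. The sketch below is illustrative only: the variable names are not part of the report, and it assumes the percentage is taken over every cell (variables times observations).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 38471
missing_cells = 67074

# Assumption: the percentage is computed over all cells in the frame.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing: {missing_pct:.1f}%")
```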
Variable types
| Numeric | 8 |
|---|---|
| DateTime | 2 |
| Categorical | 14 |
| Boolean | 1 |
Warnings
| Warning | Type |
|---|---|
| Returned has constant value "True" | Constant |
| Customer ID has a high cardinality: 1587 distinct values | High cardinality |
| Customer Name has a high cardinality: 795 distinct values | High cardinality |
| City has a high cardinality: 3475 distinct values | High cardinality |
| State has a high cardinality: 1072 distinct values | High cardinality |
| Country has a high cardinality: 147 distinct values | High cardinality |
| Product ID has a high cardinality: 9815 distinct values | High cardinality |
| Product Name has a high cardinality: 3750 distinct values | High cardinality |
| Sub-Category is highly correlated with Returned and 1 other fields | High correlation |
| Region is highly correlated with Returned | High correlation |
| Returned is highly correlated with Sub-Category and 6 other fields | High correlation |
| Order Priority is highly correlated with Returned | High correlation |
| Category is highly correlated with Sub-Category and 1 other fields | High correlation |
| Ship Mode is highly correlated with Returned | High correlation |
| Market is highly correlated with Returned | High correlation |
| Segment is highly correlated with Returned | High correlation |
| Postal Code has 30915 (80.4%) missing values | Missing |
| Returned has 36159 (94.0%) missing values | Missing |
| df_index is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| Discount has 21767 (56.6%) zeros | Zeros |
| Profit has 510 (1.3%) zeros | Zeros |
| ship_delay has 1987 (5.2%) zeros | Zeros |
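The Missing and Zeros percentages listed above follow directly from the raw counts and the 38471 observations. The sketch below recomputes them; the helper name `share` is hypothetical, and the counts are copied from the report rather than read from the data.

```python
n_observations = 38471

# Counts copied from the Missing and Zeros entries.
missing_counts = {"Postal Code": 30915, "Returned": 36159}
zero_counts = {"Discount": 21767, "Profit": 510, "ship_delay": 1987}


def share(count, total=n_observations):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


missing_pct = {name: share(c) for name, c in missing_counts.items()}
zero_pct = {name: share(c) for name, c in zero_counts.items()}

# Sum of the missing counts for the two flagged columns.
total_missing = sum(missing_counts.values())

print(missing_pct)
print(zero_pct)
print(total_missing)
```

The total printed last equals the 67074 missing cells reported in the dataset statistics, so Postal Code and Returned account for every missing cell in the dataset.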
Reproduction
| Analysis started | 2021-10-15 07:52:47.756270 |
|---|---|
| Analysis finished | 2021-10-15 07:52:59.188527 |
| Duration | 11.43 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
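The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2021-10-15 07:52:47.756270")
finished = datetime.fromisoformat("2021-10-15 07:52:59.188527")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```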
df_index
| Distinct | 38471 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25664.07925 |
|---|---|
| Minimum | 0 |
| Maximum | 51294 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2572.5 |
| Q1 | 12835.5 |
| median | 25602 |
| Q3 | 38518.5 |
| 95-th percentile | 48757.5 |
| Maximum | 51294 |
| Range | 51294 |
| Interquartile range (IQR) | 25683 |
Descriptive statistics
| Standard deviation | 14811.31478 |
|---|---|
| Coefficient of variation (CV) | 0.5771223908 |
| Kurtosis | -1.199828273 |
| Mean | 25664.07925 |
| Median Absolute Deviation (MAD) | 12841 |
| Skewness | 0.00232785384 |
| Sum | 987322793 |
| Variance | 219375045.4 |
| Monotonicity | Not monotonic |
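Several of the figures above are derived from others in the same section, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and mean from the reported values. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 0, 51294
q1, q3 = 12835.5, 38518.5
mean = 25664.07925
std = 14811.31478
total_sum = 987322793
n_observations = 38471

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance
mean_from_sum = total_sum / n_observations

print(value_range, iqr)
print(round(cv, 6), round(variance, 1), round(mean_from_sum, 5))
print(math.isclose(mean_from_sum, mean, abs_tol=1e-4))
```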
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 19164 | 1 | < 0.1% |
| 41697 | 1 | < 0.1% |
| 47842 | 1 | < 0.1% |
| 45795 | 1 | < 0.1% |
| 35556 | 1 | < 0.1% |
| 33509 | 1 | < 0.1% |
| 37607 | 1 | < 0.1% |
| 49901 | 1 | < 0.1% |
| 8945 | 1 | < 0.1% |
| Other values (38461) | 38461 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 5 | 1 | |
| 8 | 1 |
| Value | Count | Frequency (%) |
| 51294 | 1 | |
| 51293 | 1 | |
| 51292 | 1 | |
| 51291 | 1 | |
| 51290 | 1 |
Order Date
Date
| Distinct | 1424 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Minimum | 2011-01-01 00:00:00 |
|---|---|
| Maximum | 2014-12-31 00:00:00 |
Ship Date
Date
| Distinct | 1463 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Minimum | 2011-01-03 00:00:00 |
|---|---|
| Maximum | 2015-01-07 00:00:00 |
Ship Mode
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Standard Class | 23148 |
|---|---|
| Second Class | 7670 |
| First Class | 5588 |
| Same Day | 2065 |
Length
| Max length | 14 |
|---|---|
| Median length | 14 |
| Mean length | 12.84344051 |
| Min length | 8 |
Characters and Unicode
| Total characters | 494100 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Standard Class |
|---|---|
| 2nd row | Standard Class |
| 3rd row | Second Class |
| 4th row | Second Class |
| 5th row | First Class |
| Value | Count | Frequency (%) |
| Standard Class | 23148 | 60.2% |
| Second Class | 7670 | 19.9% |
| First Class | 5588 | 14.5% |
| Same Day | 2065 | 5.4% |
| Value | Count | Frequency (%) |
| class | 36406 | |
| standard | 23148 | |
| second | 7670 | 10.0% |
| first | 5588 | 7.3% |
| same | 2065 | 2.7% |
| day | 2065 | 2.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 86832 | |
| s | 78400 | |
| d | 53966 | |
| (space) | 38471 | |
| C | 36406 | |
| l | 36406 | |
| S | 32883 | 6.7% |
| n | 30818 | 6.2% |
| t | 28736 | 5.8% |
| r | 28736 | 5.8% |
| Other values (8) | 42446 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 378687 | |
| Uppercase Letter | 76942 | 15.6% |
| Space Separator | 38471 | 7.8% |
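The category totals above can be reproduced from the four Ship Mode values and their counts. The sketch below uses Python's `unicodedata` module and weights each character by how often its value occurs. It is an illustration of the idea, not necessarily how the profiling tool computes these figures internally.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Ship Mode frequency table.
value_counts = {
    "Standard Class": 23148,
    "Second Class": 7670,
    "First Class": 5588,
    "Same Day": 2065,
}

category_counts = Counter()
for value, count in value_counts.items():
    for char in value:
        # unicodedata.category returns two-letter codes such as "Ll"
        # (lowercase letter), "Lu" (uppercase letter), "Zs" (space separator).
        category_counts[unicodedata.category(char)] += count

total_characters = sum(category_counts.values())
mean_length = total_characters / sum(value_counts.values())

print(dict(category_counts))
print(total_characters, round(mean_length, 4))
```

The `Ll`, `Lu`, and `Zs` totals correspond to the Lowercase Letter, Uppercase Letter, and Space Separator rows, and the character total divided by the number of observations gives the mean length.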
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 86832 | |
| s | 78400 | |
| d | 53966 | |
| l | 36406 | |
| n | 30818 | 8.1% |
| t | 28736 | 7.6% |
| r | 28736 | 7.6% |
| e | 9735 | 2.6% |
| c | 7670 | 2.0% |
| o | 7670 | 2.0% |
| Other values (3) | 9718 | 2.6% |
| Value | Count | Frequency (%) |
| C | 36406 | |
| S | 32883 | |
| F | 5588 | 7.3% |
| D | 2065 | 2.7% |
| Value | Count | Frequency (%) |
| (space) | 38471 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 455629 | |
| Common | 38471 | 7.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 86832 | |
| s | 78400 | |
| d | 53966 | |
| C | 36406 | |
| l | 36406 | |
| S | 32883 | 7.2% |
| n | 30818 | 6.8% |
| t | 28736 | 6.3% |
| r | 28736 | 6.3% |
| e | 9735 | 2.1% |
| Other values (7) | 32711 | 7.2% |
| Value | Count | Frequency (%) |
| (space) | 38471 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 494100 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 86832 | |
| s | 78400 | |
| d | 53966 | |
| (space) | 38471 | |
| C | 36406 | |
| l | 36406 | |
| S | 32883 | 6.7% |
| n | 30818 | 6.2% |
| t | 28736 | 5.8% |
| r | 28736 | 5.8% |
| Other values (8) | 42446 |
Customer ID
| Distinct | 1587 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| PO-18850 | 79 |
|---|---|
| CK-12205 | 71 |
| BE-11335 | 70 |
| ZC-21910 | 68 |
| EM-13960 | 67 |
| Other values (1582) |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.815887292 |
| Min length | 5 |
Characters and Unicode
| Total characters | 300685 |
|---|---|
| Distinct characters | 40 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | SP-20860 |
|---|---|
| 2nd row | JD-15895 |
| 3rd row | AB-10600 |
| 4th row | GH-14410 |
| 5th row | KW-16435 |
| Value | Count | Frequency (%) |
| PO-18850 | 79 | 0.2% |
| CK-12205 | 71 | 0.2% |
| BE-11335 | 70 | 0.2% |
| ZC-21910 | 68 | 0.2% |
| EM-13960 | 67 | 0.2% |
| JG-15805 | 67 | 0.2% |
| BW-11110 | 66 | 0.2% |
| SW-20755 | 65 | 0.2% |
| WB-21850 | 64 | 0.2% |
| MY-18295 | 64 | 0.2% |
| Other values (1577) | 37790 |
| Value | Count | Frequency (%) |
| po-18850 | 79 | 0.2% |
| ck-12205 | 71 | 0.2% |
| be-11335 | 70 | 0.2% |
| zc-21910 | 68 | 0.2% |
| jg-15805 | 67 | 0.2% |
| em-13960 | 67 | 0.2% |
| bw-11110 | 66 | 0.2% |
| sw-20755 | 65 | 0.2% |
| mp-17965 | 64 | 0.2% |
| my-18295 | 64 | 0.2% |
| Other values (1577) | 37790 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 41114 | |
| - | 38471 | |
| 0 | 32455 | 10.8% |
| 5 | 29920 | 10.0% |
| 2 | 16176 | 5.4% |
| 8 | 11089 | 3.7% |
| 6 | 11031 | 3.7% |
| 7 | 11010 | 3.7% |
| 3 | 10926 | 3.6% |
| 4 | 10800 | 3.6% |
| Other values (30) | 87693 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 185272 | |
| Uppercase Letter | 76797 | |
| Dash Punctuation | 38471 | 12.8% |
| Lowercase Letter | 145 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| M | 6765 | 8.8% |
| C | 6660 | 8.7% |
| S | 6555 | 8.5% |
| B | 6339 | 8.3% |
| D | 4908 | 6.4% |
| J | 4649 | 6.1% |
| A | 4415 | 5.7% |
| H | 3920 | 5.1% |
| P | 3880 | 5.1% |
| R | 3647 | 4.7% |
| Other values (16) | 25059 |
| Value | Count | Frequency (%) |
| 1 | 41114 | |
| 0 | 32455 | |
| 5 | 29920 | |
| 2 | 16176 | 8.7% |
| 8 | 11089 | 6.0% |
| 6 | 11031 | 6.0% |
| 7 | 11010 | 5.9% |
| 3 | 10926 | 5.9% |
| 4 | 10800 | 5.8% |
| 9 | 10751 | 5.8% |
| Value | Count | Frequency (%) |
| p | 57 | |
| o | 54 | |
| l | 34 |
| Value | Count | Frequency (%) |
| - | 38471 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 223743 | |
| Latin | 76942 | 25.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| M | 6765 | 8.8% |
| C | 6660 | 8.7% |
| S | 6555 | 8.5% |
| B | 6339 | 8.2% |
| D | 4908 | 6.4% |
| J | 4649 | 6.0% |
| A | 4415 | 5.7% |
| H | 3920 | 5.1% |
| P | 3880 | 5.0% |
| R | 3647 | 4.7% |
| Other values (19) | 25204 |
| Value | Count | Frequency (%) |
| 1 | 41114 | |
| - | 38471 | |
| 0 | 32455 | |
| 5 | 29920 | |
| 2 | 16176 | 7.2% |
| 8 | 11089 | 5.0% |
| 6 | 11031 | 4.9% |
| 7 | 11010 | 4.9% |
| 3 | 10926 | 4.9% |
| 4 | 10800 | 4.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 300685 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 41114 | |
| - | 38471 | |
| 0 | 32455 | 10.8% |
| 5 | 29920 | 10.0% |
| 2 | 16176 | 5.4% |
| 8 | 11089 | 3.7% |
| 6 | 11031 | 3.7% |
| 7 | 11010 | 3.7% |
| 3 | 10926 | 3.6% |
| 4 | 10800 | 3.6% |
| Other values (30) | 87693 |
Customer Name
| Distinct | 795 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| c1301105bfca084673e0bb8fa3103221 | 84 |
|---|---|
| 8fe3138a7ef91d7f8635f63b9d5331ad | 83 |
| f054cc2c916e6fd23d9afd7e4f101362 | 81 |
| 57a1a3a30c5c54262ba894270d3c3314 | 81 |
| 9d5201e7963b7f4c2136b5168dbd91f9 | 80 |
| Other values (790) |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 1231072 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | a7d03c30d416fc5f7d695b495884fdd7 |
|---|---|
| 2nd row | 1b2850c124acd1bc24237b4b5228b65e |
| 3rd row | 6acab08bb2b385c8569adfd24730ee01 |
| 4th row | 1528a0a296f3ecf500753855ea9a21a5 |
| 5th row | 648a7c6f93ee0f453ee1378466a84ff8 |
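The values in this column are all 32 characters long and use only 16 distinct characters (digits and the letters a to f), which is the shape of a hexadecimal MD5 digest. The report does not state how the names were anonymised, so the following is an assumption: a sketch of one way such values could be produced with `hashlib`. The helper `pseudonymize` and the input name are made up for illustration.

```python
import hashlib
import string


def pseudonymize(name: str) -> str:
    """Hypothetical helper: return the hex MD5 digest of a name."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


token = pseudonymize("Jane Example")  # made-up input, not from the dataset

# An MD5 hex digest is always 32 lowercase hexadecimal characters.
is_hex = set(token) <= set(string.hexdigits.lower())
print(len(token), is_hex)
```

MD5 appears here only because it matches the observed length and alphabet; it is not a recommendation for anonymising personal data.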
| Value | Count | Frequency (%) |
| c1301105bfca084673e0bb8fa3103221 | 84 | 0.2% |
| 8fe3138a7ef91d7f8635f63b9d5331ad | 83 | 0.2% |
| f054cc2c916e6fd23d9afd7e4f101362 | 81 | 0.2% |
| 57a1a3a30c5c54262ba894270d3c3314 | 81 | 0.2% |
| 9d5201e7963b7f4c2136b5168dbd91f9 | 80 | 0.2% |
| 2d806890acc865414ad191e4f11ec62a | 77 | 0.2% |
| 0e64857da6f1a22cf71a0bdefb9f2bbc | 74 | 0.2% |
| a9066b389900001da23e2dd934673faf | 74 | 0.2% |
| 3e8c46cbd78f47c95668adf74cef15af | 74 | 0.2% |
| cdb986bad53909051244769475ad755f | 73 | 0.2% |
| Other values (785) | 37690 |
| Value | Count | Frequency (%) |
| c1301105bfca084673e0bb8fa3103221 | 84 | 0.2% |
| 8fe3138a7ef91d7f8635f63b9d5331ad | 83 | 0.2% |
| f054cc2c916e6fd23d9afd7e4f101362 | 81 | 0.2% |
| 57a1a3a30c5c54262ba894270d3c3314 | 81 | 0.2% |
| 9d5201e7963b7f4c2136b5168dbd91f9 | 80 | 0.2% |
| 2d806890acc865414ad191e4f11ec62a | 77 | 0.2% |
| 0e64857da6f1a22cf71a0bdefb9f2bbc | 74 | 0.2% |
| a9066b389900001da23e2dd934673faf | 74 | 0.2% |
| 3e8c46cbd78f47c95668adf74cef15af | 74 | 0.2% |
| cdb986bad53909051244769475ad755f | 73 | 0.2% |
| Other values (785) | 37690 |
Most occurring characters
| Value | Count | Frequency (%) |
| b | 81147 | 6.6% |
| d | 79935 | 6.5% |
| e | 79877 | 6.5% |
| 9 | 79333 | 6.4% |
| 4 | 78484 | 6.4% |
| 0 | 78114 | 6.3% |
| 1 | 77314 | 6.3% |
| f | 77062 | 6.3% |
| 7 | 76657 | 6.2% |
| 2 | 76536 | 6.2% |
| Other values (6) | 446613 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 764340 | |
| Lowercase Letter | 466732 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 9 | 79333 | |
| 4 | 78484 | |
| 0 | 78114 | |
| 1 | 77314 | |
| 7 | 76657 | |
| 2 | 76536 | |
| 3 | 75898 | |
| 8 | 74417 | |
| 5 | 74169 | |
| 6 | 73418 |
| Value | Count | Frequency (%) |
| b | 81147 | |
| d | 79935 | |
| e | 79877 | |
| f | 77062 | |
| a | 74960 | |
| c | 73751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 764340 | |
| Latin | 466732 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 9 | 79333 | |
| 4 | 78484 | |
| 0 | 78114 | |
| 1 | 77314 | |
| 7 | 76657 | |
| 2 | 76536 | |
| 3 | 75898 | |
| 8 | 74417 | |
| 5 | 74169 | |
| 6 | 73418 |
| Value | Count | Frequency (%) |
| b | 81147 | |
| d | 79935 | |
| e | 79877 | |
| f | 77062 | |
| a | 74960 | |
| c | 73751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1231072 |
Most frequent character per block
| Value | Count | Frequency (%) |
| b | 81147 | 6.6% |
| d | 79935 | 6.5% |
| e | 79877 | 6.5% |
| 9 | 79333 | 6.4% |
| 4 | 78484 | 6.4% |
| 0 | 78114 | 6.3% |
| 1 | 77314 | 6.3% |
| f | 77062 | 6.3% |
| 7 | 76657 | 6.2% |
| 2 | 76536 | 6.2% |
| Other values (6) | 446613 |
Segment
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Consumer | 20019 |
|---|---|
| Corporate | 11471 |
| Home Office | 6981 |
Length
| Max length | 11 |
|---|---|
| Median length | 8 |
| Mean length | 8.842556731 |
| Min length | 8 |
Characters and Unicode
| Total characters | 340182 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Corporate |
|---|---|
| 2nd row | Corporate |
| 3rd row | Corporate |
| 4th row | Home Office |
| 5th row | Consumer |
| Value | Count | Frequency (%) |
| Consumer | 20019 | 52.0% |
| Corporate | 11471 | 29.8% |
| Home Office | 6981 | 18.1% |
| Value | Count | Frequency (%) |
| consumer | 20019 | |
| corporate | 11471 | |
| office | 6981 | 15.4% |
| home | 6981 | 15.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 49942 | |
| e | 45452 | |
| r | 42961 | |
| C | 31490 | |
| m | 27000 | |
| n | 20019 | 5.9% |
| s | 20019 | 5.9% |
| u | 20019 | 5.9% |
| f | 13962 | 4.1% |
| p | 11471 | 3.4% |
| Other values (7) | 57847 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 287749 | |
| Uppercase Letter | 45452 | 13.4% |
| Space Separator | 6981 | 2.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| o | 49942 | |
| e | 45452 | |
| r | 42961 | |
| m | 27000 | |
| n | 20019 | |
| s | 20019 | |
| u | 20019 | |
| f | 13962 | 4.9% |
| p | 11471 | 4.0% |
| a | 11471 | 4.0% |
| Other values (3) | 25433 |
| Value | Count | Frequency (%) |
| C | 31490 | |
| H | 6981 | 15.4% |
| O | 6981 | 15.4% |
| Value | Count | Frequency (%) |
| (space) | 6981 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 333201 | |
| Common | 6981 | 2.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| o | 49942 | |
| e | 45452 | |
| r | 42961 | |
| C | 31490 | |
| m | 27000 | |
| n | 20019 | 6.0% |
| s | 20019 | 6.0% |
| u | 20019 | 6.0% |
| f | 13962 | 4.2% |
| p | 11471 | 3.4% |
| Other values (6) | 50866 |
| Value | Count | Frequency (%) |
| (space) | 6981 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 340182 |
Most frequent character per block
| Value | Count | Frequency (%) |
| o | 49942 | |
| e | 45452 | |
| r | 42961 | |
| C | 31490 | |
| m | 27000 | |
| n | 20019 | 5.9% |
| s | 20019 | 5.9% |
| u | 20019 | 5.9% |
| f | 13962 | 4.1% |
| p | 11471 | 3.4% |
| Other values (7) | 57847 |
City
| Distinct | 3475 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| New York City | 707 |
|---|---|
| Los Angeles | 580 |
| Philadelphia | 402 |
| San Francisco | 376 |
| Manila | 323 |
| Other values (3470) |
Length
| Max length | 35 |
|---|---|
| Median length | 8 |
| Mean length | 8.424709521 |
| Min length | 2 |
Characters and Unicode
| Total characters | 324107 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 3 |
Unique
| Unique | 591 |
|---|---|
| Unique (%) | 1.5% |
Sample
| 1st row | Murfreesboro |
|---|---|
| 2nd row | Oosterhout |
| 3rd row | Phnom Penh |
| 4th row | Lima |
| 5th row | London |
| Value | Count | Frequency (%) |
| New York City | 707 | 1.8% |
| Los Angeles | 580 | 1.5% |
| Philadelphia | 402 | 1.0% |
| San Francisco | 376 | 1.0% |
| Manila | 323 | 0.8% |
| Santo Domingo | 321 | 0.8% |
| Seattle | 315 | 0.8% |
| Houston | 284 | 0.7% |
| Tegucigalpa | 263 | 0.7% |
| Lagos | 260 | 0.7% |
| Other values (3465) | 34640 |
| Value | Count | Frequency (%) |
| city | 1348 | 2.8% |
| san | 1271 | 2.7% |
| new | 737 | 1.5% |
| york | 734 | 1.5% |
| los | 682 | 1.4% |
| angeles | 584 | 1.2% |
| de | 464 | 1.0% |
| francisco | 408 | 0.9% |
| philadelphia | 402 | 0.8% |
| santo | 339 | 0.7% |
| Other values (3638) | 40989 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 41415 | 12.8% |
| n | 24347 | 7.5% |
| e | 23915 | 7.4% |
| o | 22914 | 7.1% |
| i | 20255 | 6.2% |
| r | 17864 | 5.5% |
| l | 16110 | 5.0% |
| s | 12155 | 3.8% |
| t | 11937 | 3.7% |
| u | 11734 | 3.6% |
| Other values (66) | 121461 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 265795 | |
| Uppercase Letter | 47596 | 14.7% |
| Space Separator | 9487 | 2.9% |
| Dash Punctuation | 955 | 0.3% |
| Other Punctuation | 266 | 0.1% |
| Open Punctuation | 3 | < 0.1% |
| Close Punctuation | 3 | < 0.1% |
| Final Punctuation | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 41415 | |
| n | 24347 | 9.2% |
| e | 23915 | 9.0% |
| o | 22914 | 8.6% |
| i | 20255 | 7.6% |
| r | 17864 | 6.7% |
| l | 16110 | 6.1% |
| s | 12155 | 4.6% |
| t | 11937 | 4.5% |
| u | 11734 | 4.4% |
| Other values (32) | 63149 |
| Value | Count | Frequency (%) |
| S | 5624 | |
| C | 5417 | |
| M | 4539 | 9.5% |
| B | 3498 | 7.3% |
| L | 3223 | 6.8% |
| A | 3104 | 6.5% |
| P | 3022 | 6.3% |
| T | 1974 | 4.1% |
| D | 1917 | 4.0% |
| N | 1863 | 3.9% |
| Other values (17) | 13415 |
| Value | Count | Frequency (%) |
| ' | 262 | |
| . | 4 | 1.5% |
| Value | Count | Frequency (%) |
| (space) | 9487 | |
| Value | Count | Frequency (%) |
| - | 955 |
| Value | Count | Frequency (%) |
| ’ | 2 |
| Value | Count | Frequency (%) |
| ( | 3 |
| Value | Count | Frequency (%) |
| ) | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 313391 | |
| Common | 10716 | 3.3% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 41415 | 13.2% |
| n | 24347 | 7.8% |
| e | 23915 | 7.6% |
| o | 22914 | 7.3% |
| i | 20255 | 6.5% |
| r | 17864 | 5.7% |
| l | 16110 | 5.1% |
| s | 12155 | 3.9% |
| t | 11937 | 3.8% |
| u | 11734 | 3.7% |
| Other values (59) | 110745 |
| Value | Count | Frequency (%) |
| (space) | 9487 | |
| - | 955 | 8.9% |
| ' | 262 | 2.4% |
| . | 4 | < 0.1% |
| ( | 3 | < 0.1% |
| ) | 3 | < 0.1% |
| ’ | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 322287 | |
| None | 1818 | 0.6% |
| Punctuation | 2 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 41415 | 12.9% |
| n | 24347 | 7.6% |
| e | 23915 | 7.4% |
| o | 22914 | 7.1% |
| i | 20255 | 6.3% |
| r | 17864 | 5.5% |
| l | 16110 | 5.0% |
| s | 12155 | 3.8% |
| t | 11937 | 3.7% |
| u | 11734 | 3.6% |
| Other values (48) | 119641 |
| Value | Count | Frequency (%) |
| á | 480 | |
| í | 386 | |
| ó | 313 | |
| é | 224 | |
| ã | 194 | |
| ú | 61 | 3.4% |
| ü | 43 | 2.4% |
| ç | 33 | 1.8% |
| ñ | 24 | 1.3% |
| â | 20 | 1.1% |
| Other values (7) | 40 | 2.2% |
| Value | Count | Frequency (%) |
| ’ | 2 |
State
| Distinct | 1072 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| California | 1518 |
|---|---|
| England | 1123 |
| New York | 866 |
| Texas | 759 |
| Ile-de-France | 725 |
| Other values (1067) |
Length
| Max length | 36 |
|---|---|
| Median length | 8 |
| Mean length | 9.641470198 |
| Min length | 3 |
Characters and Unicode
| Total characters | 370917 |
|---|---|
| Distinct characters | 84 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 3 |
Unique
| Unique | 91 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | Tennessee |
|---|---|
| 2nd row | North Brabant |
| 3rd row | Phnom Penh |
| 4th row | Lima (city) |
| 5th row | England |
| Value | Count | Frequency (%) |
| California | 1518 | 3.9% |
| England | 1123 | 2.9% |
| New York | 866 | 2.3% |
| Texas | 759 | 2.0% |
| Ile-de-France | 725 | 1.9% |
| New South Wales | 600 | 1.6% |
| North Rhine-Westphalia | 543 | 1.4% |
| Queensland | 539 | 1.4% |
| San Salvador | 468 | 1.2% |
| National Capital | 440 | 1.1% |
| Other values (1062) | 30890 |
| Value | Count | Frequency (%) |
| california | 1617 | 3.2% |
| new | 1609 | 3.1% |
| england | 1123 | 2.2% |
| south | 912 | 1.8% |
| york | 866 | 1.7% |
| north | 864 | 1.7% |
| texas | 759 | 1.5% |
| ile-de-france | 725 | 1.4% |
| wales | 627 | 1.2% |
| san | 596 | 1.2% |
| Other values (1165) | 41481 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 54764 | |
| n | 29578 | 8.0% |
| i | 25087 | 6.8% |
| e | 23322 | 6.3% |
| r | 21272 | 5.7% |
| o | 21243 | 5.7% |
| l | 17914 | 4.8% |
| t | 15922 | 4.3% |
| s | 14597 | 3.9% |
| (space) | 12708 | 3.4% |
| Other values (74) | 134510 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 299227 | |
| Uppercase Letter | 53874 | 14.5% |
| Space Separator | 12708 | 3.4% |
| Dash Punctuation | 4340 | 1.2% |
| Other Punctuation | 610 | 0.2% |
| Open Punctuation | 79 | < 0.1% |
| Close Punctuation | 79 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 54764 | |
| n | 29578 | |
| i | 25087 | 8.4% |
| e | 23322 | 7.8% |
| r | 21272 | 7.1% |
| o | 21243 | 7.1% |
| l | 17914 | 6.0% |
| t | 15922 | 5.3% |
| s | 14597 | 4.9% |
| u | 11030 | 3.7% |
| Other values (40) | 64498 |
| Value | Count | Frequency (%) |
| C | 5699 | 10.6% |
| S | 5187 | 9.6% |
| A | 4167 | 7.7% |
| N | 3758 | 7.0% |
| M | 3079 | 5.7% |
| P | 3007 | 5.6% |
| B | 2684 | 5.0% |
| T | 2357 | 4.4% |
| W | 2286 | 4.2% |
| G | 1943 | 3.6% |
| Other values (18) | 19707 |
| Value | Count | Frequency (%) |
| ' | 571 | |
| . | 39 | 6.4% |
| Value | Count | Frequency (%) |
| (space) | 12708 | |
| Value | Count | Frequency (%) |
| ( | 79 |
| Value | Count | Frequency (%) |
| ) | 79 |
| Value | Count | Frequency (%) |
| - | 4340 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 353101 | |
| Common | 17816 | 4.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 54764 | |
| n | 29578 | 8.4% |
| i | 25087 | 7.1% |
| e | 23322 | 6.6% |
| r | 21272 | 6.0% |
| o | 21243 | 6.0% |
| l | 17914 | 5.1% |
| t | 15922 | 4.5% |
| s | 14597 | 4.1% |
| u | 11030 | 3.1% |
| Other values (68) | 118372 |
| Value | Count | Frequency (%) |
| (space) | 12708 | |
| - | 4340 | 24.4% |
| ' | 571 | 3.2% |
| ( | 79 | 0.4% |
| ) | 79 | 0.4% |
| . | 39 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 367458 | |
| None | 3377 | 0.9% |
| Latin Ext Additional | 82 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 54764 | |
| n | 29578 | 8.0% |
| i | 25087 | 6.8% |
| e | 23322 | 6.3% |
| r | 21272 | 5.8% |
| o | 21243 | 5.8% |
| l | 17914 | 4.9% |
| t | 15922 | 4.3% |
| s | 14597 | 4.0% |
| (space) | 12708 | 3.5% |
| Other values (48) | 131051 |
| Value | Count | Frequency (%) |
| ủ | 30 | |
| ộ | 30 | |
| ỉ | 11 | 13.4% |
| ầ | 11 | 13.4% |
| Value | Count | Frequency (%) |
| é | 681 | |
| á | 669 | |
| í | 549 | |
| ô | 505 | |
| ã | 337 | |
| ó | 218 | 6.5% |
| ü | 204 | 6.0% |
| è | 50 | 1.5% |
| à | 39 | 1.2% |
| ä | 26 | 0.8% |
| Other values (12) | 99 | 2.9% |
Country
| Distinct | 147 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| United States | 7556 |
|---|---|
| Australia | 2137 |
| France | 2114 |
| Mexico | 2012 |
| Germany | 1570 |
| Other values (142) |
Length
| Max length | 32 |
|---|---|
| Median length | 8 |
| Mean length | 8.849600998 |
| Min length | 4 |
Characters and Unicode
| Total characters | 340453 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | United States |
|---|---|
| 2nd row | Netherlands |
| 3rd row | Cambodia |
| 4th row | Peru |
| 5th row | United Kingdom |
| Value | Count | Frequency (%) |
| United States | 7556 | 19.6% |
| Australia | 2137 | 5.6% |
| France | 2114 | 5.5% |
| Mexico | 2012 | 5.2% |
| Germany | 1570 | 4.1% |
| China | 1400 | 3.6% |
| United Kingdom | 1220 | 3.2% |
| Brazil | 1208 | 3.1% |
| India | 1154 | 3.0% |
| Indonesia | 1057 | 2.7% |
| Other values (137) | 17043 |
| Value | Count | Frequency (%) |
| united | 8787 | 17.2% |
| states | 7556 | 14.8% |
| australia | 2137 | 4.2% |
| france | 2114 | 4.1% |
| mexico | 2012 | 3.9% |
| germany | 1570 | 3.1% |
| china | 1400 | 2.7% |
| kingdom | 1220 | 2.4% |
| brazil | 1208 | 2.4% |
| india | 1154 | 2.3% |
| Other values (154) | 22018 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 41325 | 12.1% |
| e | 32547 | 9.6% |
| t | 30534 | 9.0% |
| i | 30506 | 9.0% |
| n | 27638 | 8.1% |
| d | 16001 | 4.7% |
| r | 15216 | 4.5% |
| s | 13828 | 4.1% |
| (space) | 12705 | 3.7% |
| o | 10881 | 3.2% |
| Other values (44) | 109272 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 276908 | |
| Uppercase Letter | 50547 | 14.8% |
| Space Separator | 12705 | 3.7% |
| Open Punctuation | 100 | < 0.1% |
| Close Punctuation | 100 | < 0.1% |
| Other Punctuation | 85 | < 0.1% |
| Dash Punctuation | 8 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 41325 | |
| e | 32547 | |
| t | 30534 | |
| i | 30506 | |
| n | 27638 | |
| d | 16001 | 5.8% |
| r | 15216 | 5.5% |
| s | 13828 | 5.0% |
| o | 10881 | 3.9% |
| l | 10138 | 3.7% |
| Other values (16) | 48294 |
| Value | Count | Frequency (%) |
| S | 10051 | |
| U | 9143 | |
| I | 3997 | 7.9% |
| A | 3616 | 7.2% |
| C | 3174 | 6.3% |
| M | 2806 | 5.6% |
| F | 2154 | 4.3% |
| G | 2125 | 4.2% |
| N | 2031 | 4.0% |
| B | 1762 | 3.5% |
| Other values (13) | 9688 |
| Value | Count | Frequency (%) |
| (space) | 12705 | |
| Value | Count | Frequency (%) |
| ( | 100 |
| Value | Count | Frequency (%) |
| ) | 100 |
| Value | Count | Frequency (%) |
| ' | 85 |
| Value | Count | Frequency (%) |
| - | 8 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 327455 | |
| Common | 12998 | 3.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 41325 | |
| e | 32547 | 9.9% |
| t | 30534 | 9.3% |
| i | 30506 | 9.3% |
| n | 27638 | 8.4% |
| d | 16001 | 4.9% |
| r | 15216 | 4.6% |
| s | 13828 | 4.2% |
| o | 10881 | 3.3% |
| l | 10138 | 3.1% |
| Other values (39) | 98841 |
| Value | Count | Frequency (%) |
| (space) | 12705 | |
| ( | 100 | 0.8% |
| ) | 100 | 0.8% |
| ' | 85 | 0.7% |
| - | 8 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 340453 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 41325 | 12.1% |
| e | 32547 | 9.6% |
| t | 30534 | 9.0% |
| i | 30506 | 9.0% |
| n | 27638 | 8.1% |
| d | 16001 | 4.7% |
| r | 15216 | 4.5% |
| s | 13828 | 4.1% |
| (space) | 12705 | 3.7% |
| o | 10881 | 3.2% |
| Other values (44) | 109272 |
Postal Code
| Distinct | 609 |
|---|---|
| Distinct (%) | 8.1% |
| Missing | 30915 |
| Missing (%) | 80.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55150.06392 |
|---|---|
| Minimum | 1040 |
| Maximum | 99301 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 1040 |
|---|---|
| 5-th percentile | 10009 |
| Q1 | 23223 |
| median | 56301 |
| Q3 | 90008 |
| 95-th percentile | 97567 |
| Maximum | 99301 |
| Range | 98261 |
| Interquartile range (IQR) | 66785 |
Descriptive statistics
| Standard deviation | 32021.07257 |
|---|---|
| Coefficient of variation (CV) | 0.5806171433 |
| Kurtosis | -1.494324426 |
| Mean | 55150.06392 |
| Median Absolute Deviation (MAD) | 33703 |
| Skewness | -0.1265501212 |
| Sum | 416713883 |
| Variance | 1025349088 |
| Monotonicity | Not monotonic |
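With 80.4% of the values missing, it helps to know which denominator each statistic uses. The sketch below assumes the mean and the distinct percentage are computed over non-missing values only, and checks that assumption against the reported figures. The variable names are illustrative.

```python
# Figures copied from the tables for this variable.
n_observations = 38471
missing = 30915
distinct = 609
total_sum = 416713883

# Assumption: statistics are computed over non-missing values only.
n_present = n_observations - missing
mean = total_sum / n_present
distinct_pct = round(100 * distinct / n_present, 1)

print(n_present)
print(round(mean, 5))
print(distinct_pct)
```

Under that assumption the results agree with the reported mean and Distinct (%). The non-missing count, 7556, is also the number of United States rows in the Country table, which suggests postal codes are recorded only for US orders.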
| Value | Count | Frequency (%) |
| 10035 | 202 | 0.5% |
| 10024 | 186 | 0.5% |
| 10009 | 172 | 0.4% |
| 94122 | 154 | 0.4% |
| 10011 | 147 | 0.4% |
| 19134 | 127 | 0.3% |
| 98105 | 126 | 0.3% |
| 90049 | 117 | 0.3% |
| 94110 | 117 | 0.3% |
| 98103 | 116 | 0.3% |
| Other values (599) | 6092 | 15.8% |
| (Missing) | 30915 |
| Value | Count | Frequency (%) |
| 1040 | 1 | < 0.1% |
| 1453 | 4 | < 0.1% |
| 1752 | 2 | < 0.1% |
| 1810 | 4 | < 0.1% |
| 1841 | 27 |
| Value | Count | Frequency (%) |
| 99301 | 3 | |
| 99207 | 5 | |
| 98661 | 4 | |
| 98632 | 3 | |
| 98502 | 5 |
Market
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| APAC | 8242 |
|---|---|
| LATAM | 7734 |
| US | 7556 |
| EU | 7469 |
| EMEA | 3735 |
| Other values (2) | 3735 |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 3.614098932 |
| Min length | 2 |
Characters and Unicode
| Total characters | 139038 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | US |
|---|---|
| 2nd row | EU |
| 3rd row | APAC |
| 4th row | LATAM |
| 5th row | EU |
| Value | Count | Frequency (%) |
| APAC | 8242 | 21.4% |
| LATAM | 7734 | 20.1% |
| US | 7556 | 19.6% |
| EU | 7469 | 19.4% |
| EMEA | 3735 | 9.7% |
| Africa | 3450 | 9.0% |
| Canada | 285 | 0.7% |
| Value | Count | Frequency (%) |
| apac | 8242 | |
| latam | 7734 | |
| us | 7556 | |
| eu | 7469 | |
| emea | 3735 | |
| africa | 3450 | |
| canada | 285 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 39137 | |
| U | 15025 | 10.8% |
| E | 14939 | 10.7% |
| M | 11469 | 8.2% |
| C | 8527 | 6.1% |
| P | 8242 | 5.9% |
| L | 7734 | 5.6% |
| T | 7734 | 5.6% |
| S | 7556 | 5.4% |
| a | 4305 | 3.1% |
| Other values (6) | 14370 | 10.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 120363 | |
| Lowercase Letter | 18675 | 13.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| A | 39137 | |
| U | 15025 | 12.5% |
| E | 14939 | 12.4% |
| M | 11469 | 9.5% |
| C | 8527 | 7.1% |
| P | 8242 | 6.8% |
| L | 7734 | 6.4% |
| T | 7734 | 6.4% |
| S | 7556 | 6.3% |
| Value | Count | Frequency (%) |
| a | 4305 | |
| f | 3450 | |
| r | 3450 | |
| i | 3450 | |
| c | 3450 | |
| n | 285 | 1.5% |
| d | 285 | 1.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 139038 |
Most frequent character per script
| Value | Count | Frequency (%) |
| A | 39137 | |
| U | 15025 | 10.8% |
| E | 14939 | 10.7% |
| M | 11469 | 8.2% |
| C | 8527 | 6.1% |
| P | 8242 | 5.9% |
| L | 7734 | 5.6% |
| T | 7734 | 5.6% |
| S | 7556 | 5.4% |
| a | 4305 | 3.1% |
| Other values (6) | 14370 | 10.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 139038 |
Most frequent character per block
| Value | Count | Frequency (%) |
| A | 39137 | |
| U | 15025 | 10.8% |
| E | 14939 | 10.7% |
| M | 11469 | 8.2% |
| C | 8527 | 6.1% |
| P | 8242 | 5.9% |
| L | 7734 | 5.6% |
| T | 7734 | 5.6% |
| S | 7556 | 5.4% |
| a | 4305 | 3.1% |
| Other values (6) | 14370 | 10.3% |
Region
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Central | 8318 |
|---|---|
| South | 5012 |
| EMEA | 3735 |
| North | 3606 |
| Africa | 3450 |
| Other values (8) | 14350 |
Length
| Max length | 14 |
|---|---|
| Median length | 6 |
| Mean length | 6.634165995 |
| Min length | 4 |
Characters and Unicode
| Total characters | 255223 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | South |
|---|---|
| 2nd row | Central |
| 3rd row | Southeast Asia |
| 4th row | South |
| 5th row | North |
| Value | Count | Frequency (%) |
| Central | 8318 | 21.6% |
| South | 5012 | 13.0% |
| EMEA | 3735 | 9.7% |
| North | 3606 | 9.4% |
| Africa | 3450 | 9.0% |
| Oceania | 2628 | 6.8% |
| West | 2412 | 6.3% |
| Southeast Asia | 2334 | 6.1% |
| East | 2148 | 5.6% |
| North Asia | 1741 | 4.5% |
| Other values (3) | 3087 | 8.0% |
| Value | Count | Frequency (%) |
| central | 9857 | 22.4% |
| asia | 5614 | 12.7% |
| north | 5347 | 12.1% |
| south | 5012 | 11.4% |
| emea | 3735 | 8.5% |
| africa | 3450 | 7.8% |
| oceania | 2628 | 6.0% |
| west | 2412 | 5.5% |
| southeast | 2334 | 5.3% |
| east | 2148 | 4.9% |
| Other values (2) | 1548 | 3.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 32040 | |
| t | 29444 | 11.5% |
| r | 19917 | 7.8% |
| e | 18494 | 7.2% |
| n | 14033 | 5.5% |
| i | 12955 | 5.1% |
| A | 12799 | 5.0% |
| o | 12693 | 5.0% |
| h | 12693 | 5.0% |
| s | 12508 | 4.9% |
| Other values (14) | 77647 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 194319 | |
| Uppercase Letter | 55290 | 21.7% |
| Space Separator | 5614 | 2.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 32040 | |
| t | 29444 | |
| r | 19917 | |
| e | 18494 | |
| n | 14033 | |
| i | 12955 | |
| o | 12693 | 6.5% |
| h | 12693 | 6.5% |
| s | 12508 | 6.4% |
| l | 9857 | 5.1% |
| Other values (5) | 19685 |
| Value | Count | Frequency (%) |
| A | 12799 | |
| C | 11405 | |
| E | 9618 | |
| S | 7346 | |
| N | 5347 | |
| M | 3735 | 6.8% |
| O | 2628 | 4.8% |
| W | 2412 | 4.4% |
| Value | Count | Frequency (%) |
| (space) | 5614 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 249609 | |
| Common | 5614 | 2.2% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 32040 | |
| t | 29444 | |
| r | 19917 | 8.0% |
| e | 18494 | 7.4% |
| n | 14033 | 5.6% |
| i | 12955 | 5.2% |
| A | 12799 | 5.1% |
| o | 12693 | 5.1% |
| h | 12693 | 5.1% |
| s | 12508 | 5.0% |
| Other values (13) | 72033 |
| Value | Count | Frequency (%) |
| (space) | 5614 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 255223 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 32040 | |
| t | 29444 | 11.5% |
| r | 19917 | 7.8% |
| e | 18494 | 7.2% |
| n | 14033 | 5.5% |
| i | 12955 | 5.1% |
| A | 12799 | 5.0% |
| o | 12693 | 5.0% |
| h | 12693 | 5.0% |
| s | 12508 | 4.9% |
| Other values (14) | 77647 |
Product ID
Categorical
| Distinct | 9815 |
|---|---|
| Distinct (%) | 25.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| OFF-AR-10003651 | 27 |
|---|---|
| OFF-AR-10003829 | 25 |
| OFF-BI-10003708 | 25 |
| OFF-BI-10004632 | 22 |
| OFF-BI-10002799 | 21 |
| Other values (9810) | 38351 |
Length
| Max length | 16 |
|---|---|
| Median length | 15 |
| Mean length | 15.19422422 |
| Min length | 15 |
Characters and Unicode
| Total characters | 584537 |
|---|---|
| Distinct characters | 35 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1729 |
|---|---|
| Unique (%) | 4.5% |
Sample
| 1st row | TEC-AC-10004227 |
|---|---|
| 2nd row | OFF-LA-10003699 |
| 3rd row | FUR-BO-10000112 |
| 4th row | FUR-CH-10004338 |
| 5th row | OFF-ST-10001646 |
| Value | Count | Frequency (%) |
| OFF-AR-10003651 | 27 | 0.1% |
| OFF-AR-10003829 | 25 | 0.1% |
| OFF-BI-10003708 | 25 | 0.1% |
| OFF-BI-10004632 | 22 | 0.1% |
| OFF-BI-10002799 | 21 | 0.1% |
| OFF-BI-10000542 | 21 | 0.1% |
| FUR-CH-10003354 | 20 | 0.1% |
| OFF-BI-10004140 | 20 | 0.1% |
| OFF-BI-10003650 | 19 | < 0.1% |
| OFF-BI-10002570 | 19 | < 0.1% |
| Other values (9805) | 38252 | 99.4% |
| Value | Count | Frequency (%) |
| tec-hp | 65 | 0.2% |
| off-ar-10003651 | 27 | 0.1% |
| off-bi-10003708 | 25 | 0.1% |
| off-ar-10003829 | 25 | 0.1% |
| off-bi-10004632 | 22 | 0.1% |
| off-bi-10002799 | 21 | 0.1% |
| off-bi-10000542 | 21 | 0.1% |
| fur-ch-10003354 | 20 | 0.1% |
| off-bi-10004140 | 20 | 0.1% |
| off-bi-10003650 | 19 | < 0.1% |
| Other values (9806) | 38271 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 134600 | |
| - | 76942 | |
| F | 58466 | |
| 1 | 57892 | |
| O | 27809 | 4.8% |
| 2 | 19214 | 3.3% |
| 3 | 19162 | 3.3% |
| 4 | 18964 | 3.2% |
| A | 15211 | 2.6% |
| C | 14404 | 2.5% |
| Other values (25) | 141873 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 307768 | |
| Uppercase Letter | 199762 | |
| Dash Punctuation | 76942 | 13.2% |
| Space Separator | 65 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| F | 58466 | |
| O | 27809 | |
| A | 15211 | 7.6% |
| C | 14404 | 7.2% |
| T | 12026 | 6.0% |
| E | 11519 | 5.8% |
| U | 11185 | 5.6% |
| R | 11037 | 5.5% |
| S | 6338 | 3.2% |
| B | 6164 | 3.1% |
| Other values (13) | 25603 |
| Value | Count | Frequency (%) |
| 0 | 134600 | |
| 1 | 57892 | |
| 2 | 19214 | 6.2% |
| 3 | 19162 | 6.2% |
| 4 | 18964 | 6.2% |
| 5 | 12164 | 4.0% |
| 7 | 11675 | 3.8% |
| 9 | 11542 | 3.8% |
| 8 | 11500 | 3.7% |
| 6 | 11055 | 3.6% |
| Value | Count | Frequency (%) |
| - | 76942 |
| Value | Count | Frequency (%) |
| (space) | 65 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 384775 | |
| Latin | 199762 |
Most frequent character per script
| Value | Count | Frequency (%) |
| F | 58466 | |
| O | 27809 | |
| A | 15211 | 7.6% |
| C | 14404 | 7.2% |
| T | 12026 | 6.0% |
| E | 11519 | 5.8% |
| U | 11185 | 5.6% |
| R | 11037 | 5.5% |
| S | 6338 | 3.2% |
| B | 6164 | 3.1% |
| Other values (13) | 25603 |
| Value | Count | Frequency (%) |
| 0 | 134600 | |
| - | 76942 | |
| 1 | 57892 | |
| 2 | 19214 | 5.0% |
| 3 | 19162 | 5.0% |
| 4 | 18964 | 4.9% |
| 5 | 12164 | 3.2% |
| 7 | 11675 | 3.0% |
| 9 | 11542 | 3.0% |
| 8 | 11500 | 3.0% |
| Other values (2) | 11120 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 584537 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 134600 | |
| - | 76942 | |
| F | 58466 | |
| 1 | 57892 | |
| O | 27809 | 4.8% |
| 2 | 19214 | 3.3% |
| 3 | 19162 | 3.3% |
| 4 | 18964 | 3.2% |
| A | 15211 | 2.6% |
| C | 14404 | 2.5% |
| Other values (25) | 141873 |
Category
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Office Supplies | 23436 |
|---|---|
| Technology | 7593 |
| Furniture | 7442 |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 12.85248629 |
| Min length | 9 |
Characters and Unicode
| Total characters | 494448 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Technology |
|---|---|
| 2nd row | Office Supplies |
| 3rd row | Furniture |
| 4th row | Furniture |
| 5th row | Office Supplies |
| Value | Count | Frequency (%) |
| Office Supplies | 23436 | 60.9% |
| Technology | 7593 | 19.7% |
| Furniture | 7442 | 19.3% |
| Value | Count | Frequency (%) |
| office | 23436 | 37.9% |
| supplies | 23436 | 37.9% |
| technology | 7593 | 12.3% |
| furniture | 7442 | 12.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 61907 | |
| i | 54314 | |
| f | 46872 | |
| p | 46872 | |
| u | 38320 | 7.8% |
| c | 31029 | 6.3% |
| l | 31029 | 6.3% |
| O | 23436 | 4.7% |
| (space) | 23436 | 4.7% |
| S | 23436 | 4.7% |
| Other values (10) | 113797 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 409105 | |
| Uppercase Letter | 61907 | 12.5% |
| Space Separator | 23436 | 4.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 61907 | |
| i | 54314 | |
| f | 46872 | |
| p | 46872 | |
| u | 38320 | |
| c | 31029 | |
| l | 31029 | |
| s | 23436 | 5.7% |
| o | 15186 | 3.7% |
| n | 15035 | 3.7% |
| Other values (5) | 45105 |
| Value | Count | Frequency (%) |
| O | 23436 | |
| S | 23436 | |
| T | 7593 | 12.3% |
| F | 7442 | 12.0% |
| Value | Count | Frequency (%) |
| (space) | 23436 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 471012 | |
| Common | 23436 | 4.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 61907 | |
| i | 54314 | |
| f | 46872 | |
| p | 46872 | |
| u | 38320 | |
| c | 31029 | 6.6% |
| l | 31029 | 6.6% |
| O | 23436 | 5.0% |
| S | 23436 | 5.0% |
| s | 23436 | 5.0% |
| Other values (9) | 90361 |
| Value | Count | Frequency (%) |
| (space) | 23436 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 494448 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 61907 | |
| i | 54314 | |
| f | 46872 | |
| p | 46872 | |
| u | 38320 | 7.8% |
| c | 31029 | 6.3% |
| l | 31029 | 6.3% |
| O | 23436 | 4.7% |
| (space) | 23436 | 4.7% |
| S | 23436 | 4.7% |
| Other values (10) | 113797 |
Sub-Category
Categorical
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Binders | 4614 |
|---|---|
| Storage | 3802 |
| Art | 3639 |
| Paper | 2663 |
| Chairs | 2576 |
| Other values (12) | 21177 |
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 7.23516415 |
| Min length | 3 |
Characters and Unicode
| Total characters | 278344 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Accessories |
|---|---|
| 2nd row | Labels |
| 3rd row | Bookcases |
| 4th row | Chairs |
| 5th row | Storage |
| Value | Count | Frequency (%) |
| Binders | 4614 | 12.0% |
| Storage | 3802 | 9.9% |
| Art | 3639 | 9.5% |
| Paper | 2663 | 6.9% |
| Chairs | 2576 | 6.7% |
| Phones | 2547 | 6.6% |
| Furnishings | 2372 | 6.2% |
| Accessories | 2344 | 6.1% |
| Labels | 1950 | 5.1% |
| Bookcases | 1835 | 4.8% |
| Other values (7) | 10129 | 26.3% |
| Value | Count | Frequency (%) |
| binders | 4614 | 12.0% |
| storage | 3802 | 9.9% |
| art | 3639 | 9.5% |
| paper | 2663 | 6.9% |
| chairs | 2576 | 6.7% |
| phones | 2547 | 6.6% |
| furnishings | 2372 | 6.2% |
| accessories | 2344 | 6.1% |
| labels | 1950 | 5.1% |
| bookcases | 1835 | 4.8% |
| Other values (7) | 10129 | 26.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 39070 | |
| e | 35865 | |
| r | 25432 | 9.1% |
| i | 20111 | 7.2% |
| n | 17947 | 6.4% |
| a | 17698 | 6.4% |
| o | 15806 | 5.7% |
| p | 12368 | 4.4% |
| t | 9249 | 3.3% |
| c | 8928 | 3.2% |
| Other values (18) | 75870 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 239873 | |
| Uppercase Letter | 38471 | 13.8% |
Most frequent character per category
| Value | Count | Frequency (%) |
| s | 39070 | |
| e | 35865 | |
| r | 25432 | |
| i | 20111 | |
| n | 17947 | |
| a | 17698 | |
| o | 15806 | |
| p | 12368 | 5.2% |
| t | 9249 | 3.9% |
| c | 8928 | 3.7% |
| Other values (8) | 37399 |
| Value | Count | Frequency (%) |
| A | 7300 | |
| B | 6449 | |
| S | 5616 | |
| P | 5210 | |
| C | 4190 | |
| F | 4180 | |
| L | 1950 | 5.1% |
| E | 1829 | 4.8% |
| M | 1088 | 2.8% |
| T | 659 | 1.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 278344 |
Most frequent character per script
| Value | Count | Frequency (%) |
| s | 39070 | |
| e | 35865 | |
| r | 25432 | 9.1% |
| i | 20111 | 7.2% |
| n | 17947 | 6.4% |
| a | 17698 | 6.4% |
| o | 15806 | 5.7% |
| p | 12368 | 4.4% |
| t | 9249 | 3.3% |
| c | 8928 | 3.2% |
| Other values (18) | 75870 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 278344 |
Most frequent character per block
| Value | Count | Frequency (%) |
| s | 39070 | |
| e | 35865 | |
| r | 25432 | 9.1% |
| i | 20111 | 7.2% |
| n | 17947 | 6.4% |
| a | 17698 | 6.4% |
| o | 15806 | 5.7% |
| p | 12368 | 4.4% |
| t | 9249 | 3.3% |
| c | 8928 | 3.2% |
| Other values (18) | 75870 |
Product Name
Categorical
| Distinct | 3750 |
|---|---|
| Distinct (%) | 9.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Staples | 165 |
|---|---|
| Cardinal Index Tab, Clear | 72 |
| Ibico Index Tab, Clear | 67 |
| Eldon File Cart, Single Width | 66 |
| Smead File Cart, Single Width | 63 |
| Other values (3745) | 38038 |
Length
| Max length | 127 |
|---|---|
| Median length | 29 |
| Mean length | 30.90800863 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1189062 |
|---|---|
| Distinct characters | 85 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 3 |
Unique
| Unique | 181 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | SanDisk Ultra 16 GB MicroSDHC Class 10 Memory Card |
|---|---|
| 2nd row | Smead File Folder Labels, Adjustable |
| 3rd row | Dania Corner Shelving, Pine |
| 4th row | Hon Bag Chairs, Red |
| 5th row | Fellowes Box, Wire Frame |
| Value | Count | Frequency (%) |
| Staples | 165 | 0.4% |
| Cardinal Index Tab, Clear | 72 | 0.2% |
| Ibico Index Tab, Clear | 67 | 0.2% |
| Eldon File Cart, Single Width | 66 | 0.2% |
| Smead File Cart, Single Width | 63 | 0.2% |
| Sanford Pencil Sharpener, Water Color | 60 | 0.2% |
| Acco Index Tab, Clear | 59 | 0.2% |
| Rogers File Cart, Single Width | 57 | 0.1% |
| Tenex File Cart, Single Width | 53 | 0.1% |
| Stanley Pencil Sharpener, Water Color | 50 | 0.1% |
| Other values (3740) | 37759 | 98.1% |
| Value | Count | Frequency (%) |
| labels | 1783 | 1.0% |
| recycled | 1729 | 1.0% |
| with | 1672 | 1.0% |
| set | 1599 | 0.9% |
| color | 1580 | 0.9% |
| blue | 1568 | 0.9% |
| durable | 1561 | 0.9% |
| black | 1528 | 0.9% |
| avery | 1464 | 0.8% |
| clear | 1414 | 0.8% |
| Other values (2797) | 158116 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 135239 | 11.4% |
| e | 116177 | 9.8% |
| a | 70919 | 6.0% |
| r | 68645 | 5.8% |
| o | 66508 | 5.6% |
| l | 60120 | 5.1% |
| i | 59436 | 5.0% |
| n | 50986 | 4.3% |
| t | 47003 | 4.0% |
| s | 45347 | 3.8% |
| Other values (75) | 468682 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 814433 | |
| Uppercase Letter | 176764 | 14.9% |
| Space Separator | 135585 | 11.4% |
| Other Punctuation | 37615 | 3.2% |
| Decimal Number | 19537 | 1.6% |
| Dash Punctuation | 4945 | 0.4% |
| Open Punctuation | 47 | < 0.1% |
| Close Punctuation | 47 | < 0.1% |
| Final Punctuation | 46 | < 0.1% |
| Math Symbol | 26 | < 0.1% |
| Other values (2) | 17 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 116177 | |
| a | 70919 | 8.7% |
| r | 68645 | 8.4% |
| o | 66508 | 8.2% |
| l | 60120 | 7.4% |
| i | 59436 | 7.3% |
| n | 50986 | 6.3% |
| t | 47003 | 5.8% |
| s | 45347 | 5.6% |
| c | 32468 | 4.0% |
| Other values (18) | 196824 |
| Value | Count | Frequency (%) |
| S | 24995 | |
| C | 20660 | |
| B | 17076 | 9.7% |
| P | 13620 | 7.7% |
| E | 9728 | 5.5% |
| A | 9333 | 5.3% |
| F | 9227 | 5.2% |
| M | 8017 | 4.5% |
| R | 7798 | 4.4% |
| T | 7576 | 4.3% |
| Other values (16) | 48734 |
| Value | Count | Frequency (%) |
| , | 33277 | |
| / | 1197 | 3.2% |
| & | 1070 | 2.8% |
| " | 986 | 2.6% |
| . | 778 | 2.1% |
| ' | 181 | 0.5% |
| # | 70 | 0.2% |
| % | 35 | 0.1% |
| * | 7 | < 0.1% |
| ! | 7 | < 0.1% |
| Other values (2) | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 4091 | |
| 0 | 3939 | |
| 5 | 2343 | |
| 2 | 2082 | |
| 3 | 2012 | |
| 8 | 1389 | 7.1% |
| 4 | 1352 | 6.9% |
| 9 | 929 | 4.8% |
| 6 | 726 | 3.7% |
| 7 | 674 | 3.4% |
| Value | Count | Frequency (%) |
| (space) | 135239 | 99.7% |
| (no-break space) | 346 | 0.3% |
| Value | Count | Frequency (%) |
| - | 4945 |
| Value | Count | Frequency (%) |
| ( | 47 |
| Value | Count | Frequency (%) |
| ) | 47 |
| Value | Count | Frequency (%) |
| ¾ | 3 |
| Value | Count | Frequency (%) |
| ” | 46 |
| Value | Count | Frequency (%) |
| + | 26 |
| Value | Count | Frequency (%) |
| “ | 14 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 991197 | |
| Common | 197865 | 16.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 116177 | 11.7% |
| a | 70919 | 7.2% |
| r | 68645 | 6.9% |
| o | 66508 | 6.7% |
| l | 60120 | 6.1% |
| i | 59436 | 6.0% |
| n | 50986 | 5.1% |
| t | 47003 | 4.7% |
| s | 45347 | 4.6% |
| c | 32468 | 3.3% |
| Other values (44) | 373588 |
| Value | Count | Frequency (%) |
| (space) | 135239 | 68.3% |
| , | 33277 | 16.8% |
| - | 4945 | 2.5% |
| 1 | 4091 | 2.1% |
| 0 | 3939 | 2.0% |
| 5 | 2343 | 1.2% |
| 2 | 2082 | 1.1% |
| 3 | 2012 | 1.0% |
| 8 | 1389 | 0.7% |
| 4 | 1352 | 0.7% |
| Other values (21) | 7196 | 3.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1188637 | |
| None | 365 | < 0.1% |
| Punctuation | 60 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 135239 | 11.4% |
| e | 116177 | 9.8% |
| a | 70919 | 6.0% |
| r | 68645 | 5.8% |
| o | 66508 | 5.6% |
| l | 60120 | 5.1% |
| i | 59436 | 5.0% |
| n | 50986 | 4.3% |
| t | 47003 | 4.0% |
| s | 45347 | 3.8% |
| Other values (69) | 468257 |
| Value | Count | Frequency (%) |
| (no-break space) | 346 | 94.8% |
| é | 14 | 3.8% |
| ¾ | 3 | 0.8% |
| à | 2 | 0.5% |
| Value | Count | Frequency (%) |
| ” | 46 | |
| “ | 14 | 23.3% |
Sales
Real number (ℝ≥0)
| Distinct | 22491 |
|---|---|
| Distinct (%) | 58.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 246.182845 |
|---|---|
| Minimum | 0.444 |
| Maximum | 22638.48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 0.444 |
|---|---|
| 5-th percentile | 8.72 |
| Q1 | 30.69 |
| median | 85.44 |
| Q3 | 250.8972 |
| 95-th percentile | 1008.10134 |
| Maximum | 22638.48 |
| Range | 22638.036 |
| Interquartile range (IQR) | 220.2072 |
Descriptive statistics
| Standard deviation | 493.7178054 |
|---|---|
| Coefficient of variation (CV) | 2.005492322 |
| Kurtosis | 211.2107054 |
| Mean | 246.182845 |
| Median Absolute Deviation (MAD) | 67.26 |
| Skewness | 8.949622475 |
| Sum | 9470900.23 |
| Variance | 243757.2714 |
| Monotonicity | Not monotonic |
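Several rows in the Sales statistics above are derived from other rows. The sketch below cross-checks them from the reported values, assuming the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The variable names are illustrative and not part of the report.

```python
# Values copied from the Sales tables in this report.
mean = 246.182845
std_dev = 493.7178054
q1, q3 = 30.69, 250.8972
minimum, maximum = 0.444, 22638.48

# Derived statistics, using the standard definitions (an assumption here).
cv = std_dev / mean              # coefficient of variation
variance = std_dev ** 2          # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr:.4f} range={value_range:.3f}")
```

The results agree with the reported coefficient of variation, variance, interquartile range and range up to rounding.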
| Value | Count | Frequency (%) |
| 12.96 | 38 | 0.1% |
| 19.44 | 31 | 0.1% |
| 24 | 29 | 0.1% |
| 10.368 | 28 | 0.1% |
| 32.4 | 26 | 0.1% |
| 25.92 | 26 | 0.1% |
| 17.52 | 25 | 0.1% |
| 15.552 | 24 | 0.1% |
| 27.96 | 23 | 0.1% |
| 12.36 | 20 | 0.1% |
| Other values (22481) | 38201 | 99.3% |
| Value | Count | Frequency (%) |
| 0.444 | 1 | < 0.1% |
| 0.556 | 1 | < 0.1% |
| 0.836 | 1 | < 0.1% |
| 0.984 | 1 | < 0.1% |
| 0.99 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 22638.48 | 1 | < 0.1% |
| 17499.95 | 1 | < 0.1% |
| 13999.96 | 1 | < 0.1% |
| 11199.968 | 1 | < 0.1% |
| 10499.97 | 1 | < 0.1% |
Quantity
Real number (ℝ≥0)
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.475630995 |
|---|---|
| Minimum | 1 |
| Maximum | 14 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 8 |
| Maximum | 14 |
| Range | 13 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.280458594 |
|---|---|
| Coefficient of variation (CV) | 0.6561279368 |
| Kurtosis | 2.331191556 |
| Mean | 3.475630995 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.371360107 |
| Sum | 133711 |
| Variance | 5.200491397 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 9598 | 24.9% |
| 3 | 7260 | 18.9% |
| 1 | 6706 | 17.4% |
| 4 | 4797 | 12.5% |
| 5 | 3652 | 9.5% |
| 6 | 2243 | 5.8% |
| 7 | 1806 | 4.7% |
| 8 | 1013 | 2.6% |
| 9 | 738 | 1.9% |
| 10 | 198 | 0.5% |
| Other values (4) | 460 | 1.2% |
| Value | Count | Frequency (%) |
| 1 | 6706 | 17.4% |
| 2 | 9598 | 24.9% |
| 3 | 7260 | 18.9% |
| 4 | 4797 | 12.5% |
| 5 | 3652 | 9.5% |
| Value | Count | Frequency (%) |
| 14 | 147 | 0.4% |
| 13 | 62 | 0.2% |
| 12 | 130 | 0.3% |
| 11 | 121 | 0.3% |
| 10 | 198 | 0.5% |
Discount
Real number (ℝ≥0)
| Distinct | 29 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1431364924 |
|---|---|
| Minimum | 0 |
| Maximum | 0.85 |
| Zeros | 21767 |
| Zeros (%) | 56.6% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.2 |
| 95-th percentile | 0.6 |
| Maximum | 0.85 |
| Range | 0.85 |
| Interquartile range (IQR) | 0.2 |
Descriptive statistics
| Standard deviation | 0.2124342108 |
|---|---|
| Coefficient of variation (CV) | 1.484137323 |
| Kurtosis | 0.7146596297 |
| Mean | 0.1431364924 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.386732947 |
| Sum | 5506.604 |
| Variance | 0.0451282939 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 21767 | 56.6% |
| 0.2 | 3781 | 9.8% |
| 0.1 | 3057 | 7.9% |
| 0.4 | 2382 | 6.2% |
| 0.6 | 1509 | 3.9% |
| 0.7 | 1343 | 3.5% |
| 0.5 | 1217 | 3.2% |
| 0.17 | 558 | 1.5% |
| 0.47 | 544 | 1.4% |
| 0.15 | 333 | 0.9% |
| Other values (19) | 1980 | 5.1% |
| Value | Count | Frequency (%) |
| 0 | 21767 | 56.6% |
| 0.002 | 316 | 0.8% |
| 0.07 | 104 | 0.3% |
| 0.1 | 3057 | 7.9% |
| 0.15 | 333 | 0.9% |
| Value | Count | Frequency (%) |
| 0.85 | 1 | < 0.1% |
| 0.8 | 242 | 0.6% |
| 0.7 | 1343 | 3.5% |
| 0.65 | 14 | < 0.1% |
| 0.602 | 14 | < 0.1% |
Profit
Real number (ℝ)
| Distinct | 22674 |
|---|---|
| Distinct (%) | 58.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.81932709 |
|---|---|
| Minimum | -4088.376 |
| Maximum | 8399.976 |
| Zeros | 510 |
| Zeros (%) | 1.3% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | -4088.376 |
|---|---|
| 5-th percentile | -84.765 |
| Q1 | 0 |
| median | 9.27 |
| Q3 | 36.97965 |
| 95-th percentile | 211.887 |
| Maximum | 8399.976 |
| Range | 12488.352 |
| Interquartile range (IQR) | 36.97965 |
Descriptive statistics
| Standard deviation | 177.1409931 |
|---|---|
| Coefficient of variation (CV) | 6.146604069 |
| Kurtosis | 301.2305522 |
| Mean | 28.81932709 |
| Median Absolute Deviation (MAD) | 16.05 |
| Skewness | 6.426160608 |
| Sum | 1108708.332 |
| Variance | 31378.93145 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 510 | 1.3% |
| 7.92 | 47 | 0.1% |
| 3.96 | 44 | 0.1% |
| 4.32 | 44 | 0.1% |
| 5.28 | 42 | 0.1% |
| 2.97 | 41 | 0.1% |
| 9 | 41 | 0.1% |
| 2.64 | 41 | 0.1% |
| 2.88 | 39 | 0.1% |
| 1.26 | 37 | 0.1% |
| Other values (22664) | 37585 | 97.7% |
| Value | Count | Frequency (%) |
| -4088.376 | 1 | < 0.1% |
| -3839.9904 | 1 | < 0.1% |
| -3701.8928 | 1 | < 0.1% |
| -3399.98 | 1 | < 0.1% |
| -3009.435 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 8399.976 | 1 | < 0.1% |
| 6719.9808 | 1 | < 0.1% |
| 5039.9856 | 1 | < 0.1% |
| 4946.37 | 1 | < 0.1% |
| 4630.4755 | 1 | < 0.1% |
Shipping Cost
Real number (ℝ≥0)
| Distinct | 14135 |
|---|---|
| Distinct (%) | 36.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.30097417 |
|---|---|
| Minimum | 0.002 |
| Maximum | 933.57 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 0.002 |
|---|---|
| 5-th percentile | 0.6 |
| Q1 | 2.6 |
| median | 7.82 |
| Q3 | 24.43 |
| 95-th percentile | 111.19 |
| Maximum | 933.57 |
| Range | 933.568 |
| Interquartile range (IQR) | 21.83 |
Descriptive statistics
| Standard deviation | 57.31690842 |
|---|---|
| Coefficient of variation (CV) | 2.179269408 |
| Kurtosis | 50.44812565 |
| Mean | 26.30097417 |
| Median Absolute Deviation (MAD) | 6.44 |
| Skewness | 5.906416938 |
| Sum | 1011824.777 |
| Variance | 3285.22799 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.35 | 54 | 0.1% |
| 0.64 | 52 | 0.1% |
| 0.86 | 49 | 0.1% |
| 0.71 | 48 | 0.1% |
| 0.79 | 47 | 0.1% |
| 1.75 | 45 | 0.1% |
| 1.26 | 45 | 0.1% |
| 0.85 | 44 | 0.1% |
| 0.67 | 44 | 0.1% |
| 0.98 | 43 | 0.1% |
| Other values (14125) | 38000 | 98.8% |
| Value | Count | Frequency (%) |
| 0.002 | 1 | < 0.1% |
| 0.003 | 1 | < 0.1% |
| 0.01 | 5 | < 0.1% |
| 0.019 | 1 | < 0.1% |
| 0.02 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 933.57 | 1 | < 0.1% |
| 915.49 | 1 | < 0.1% |
| 910.16 | 1 | < 0.1% |
| 897.35 | 1 | < 0.1% |
| 867.69 | 1 | < 0.1% |
Order Priority
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 300.7 KiB |
| Medium | 22087 |
|---|---|
| High | 11617 |
| Critical | 2927 |
| Low | 1840 |
Length
| Max length | 8 |
|---|---|
| Median length | 6 |
| Mean length | 5.404746432 |
| Min length | 3 |
Characters and Unicode
| Total characters | 207926 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Medium |
|---|---|
| 2nd row | Low |
| 3rd row | Medium |
| 4th row | Medium |
| 5th row | Medium |
| Value | Count | Frequency (%) |
| Medium | 22087 | 57.4% |
| High | 11617 | 30.2% |
| Critical | 2927 | 7.6% |
| Low | 1840 | 4.8% |
| Value | Count | Frequency (%) |
| medium | 22087 | 57.4% |
| high | 11617 | 30.2% |
| critical | 2927 | 7.6% |
| low | 1840 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 39558 | |
| M | 22087 | |
| e | 22087 | |
| d | 22087 | |
| u | 22087 | |
| m | 22087 | |
| H | 11617 | 5.6% |
| g | 11617 | 5.6% |
| h | 11617 | 5.6% |
| C | 2927 | 1.4% |
| Other values (8) | 20155 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 169455 | |
| Uppercase Letter | 38471 | 18.5% |
Most frequent character per category
| Value | Count | Frequency (%) |
| i | 39558 | |
| e | 22087 | |
| d | 22087 | |
| u | 22087 | |
| m | 22087 | |
| g | 11617 | 6.9% |
| h | 11617 | 6.9% |
| r | 2927 | 1.7% |
| t | 2927 | 1.7% |
| c | 2927 | 1.7% |
| Other values (4) | 9534 | 5.6% |
| Value | Count | Frequency (%) |
| M | 22087 | |
| H | 11617 | |
| C | 2927 | 7.6% |
| L | 1840 | 4.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 207926 |
Most frequent character per script
| Value | Count | Frequency (%) |
| i | 39558 | |
| M | 22087 | |
| e | 22087 | |
| d | 22087 | |
| u | 22087 | |
| m | 22087 | |
| H | 11617 | 5.6% |
| g | 11617 | 5.6% |
| h | 11617 | 5.6% |
| C | 2927 | 1.4% |
| Other values (8) | 20155 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 207926 |
Most frequent character per block
| Value | Count | Frequency (%) |
| i | 39558 | |
| M | 22087 | |
| e | 22087 | |
| d | 22087 | |
| u | 22087 | |
| m | 22087 | |
| H | 11617 | 5.6% |
| g | 11617 | 5.6% |
| h | 11617 | 5.6% |
| C | 2927 | 1.4% |
| Other values (8) | 20155 |
Returned
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 36159 |
| Missing (%) | 94.0% |
| Memory size | 75.3 KiB |
| True | 2312 |
|---|---|
| (Missing) | 36159 |
| Value | Count | Frequency (%) |
| True | 2312 | 6.0% |
| (Missing) | 36159 | 94.0% |
ship_delay
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.974084375 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 1987 |
| Zeros (%) | 5.2% |
| Memory size | 300.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 4 |
| Q3 | 5 |
| 95-th percentile | 7 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.737409026 |
|---|---|
| Coefficient of variation (CV) | 0.4371847355 |
| Kurtosis | -0.2569962854 |
| Mean | 3.974084375 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.4352293524 |
| Sum | 152887 |
| Variance | 3.018590125 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 10782 | 28.0% |
| 5 | 8380 | 21.8% |
| 2 | 5241 | 13.6% |
| 6 | 4754 | 12.4% |
| 3 | 3743 | 9.7% |
| 7 | 2340 | 6.1% |
| 0 | 1987 | 5.2% |
| 1 | 1244 | 3.2% |
| Value | Count | Frequency (%) |
| 0 | 1987 | 5.2% |
| 1 | 1244 | 3.2% |
| 2 | 5241 | 13.6% |
| 3 | 3743 | 9.7% |
| 4 | 10782 | 28.0% |
| Value | Count | Frequency (%) |
| 7 | 2340 | 6.1% |
| 6 | 4754 | 12.4% |
| 5 | 8380 | 21.8% |
| 4 | 10782 | 28.0% |
| 3 | 3743 | 9.7% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
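The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration on made-up toy data, not the code used to produce this report. The function name `pearson_r` and the sample values are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(x, y)
# Shifting and rescaling one variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], y)
# Reversing the direction of the relationship flips the sign.
r_negative = pearson_r(x, [-v for v in y])
```

On this perfectly linear toy data, `r_linear` and `r_shifted` are both 1 and `r_negative` is -1, up to floating-point error.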
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
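As a rough sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names (`average_ranks`, `spearman_rho`) and the toy data are assumptions for illustration. Giving ties their average rank is one common convention, assumed here.

```python
import math

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

Here `rho` is 1 because the relationship is perfectly monotonic, while `r` stays below 1 because it is not linear.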
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
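The pair-counting formula above corresponds to the variant usually called tau-a. The sketch below implements exactly that formula on made-up data. Statistical libraries often apply a tie-corrected variant instead, so results can differ when the data contain ties. The function name `kendall_tau_a` is an assumption.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A product of zero is a tie and counts as neither.
    return (concordant - discordant) / len(pairs)

# Hypothetical toy data with one swapped pair.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
```

Of the six pairs, five are concordant and one is discordant, so `tau` is (5 - 1) / 6.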
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | City | State | Country | Postal Code | Market | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | Shipping Cost | Order Priority | Returned | ship_delay |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 27507 | 2012-12-27 | 2012-12-31 | Standard Class | SP-20860 | a7d03c30d416fc5f7d695b495884fdd7 | Corporate | Murfreesboro | Tennessee | United States | 37130.0 | US | South | TEC-AC-10004227 | Technology | Accessories | SanDisk Ultra 16 GB MicroSDHC Class 10 Memory Card | 72.7440 | 7 | 0.20 | -12.7302 | 6.720 | Medium | NaN | 4 |
| 1 | 35511 | 2014-12-25 | 2015-01-01 | Standard Class | JD-15895 | 1b2850c124acd1bc24237b4b5228b65e | Corporate | Oosterhout | North Brabant | Netherlands | NaN | EU | Central | OFF-LA-10003699 | Office Supplies | Labels | Smead File Folder Labels, Adjustable | 23.7300 | 7 | 0.50 | -21.0000 | 3.430 | Low | NaN | 7 |
| 2 | 9172 | 2012-05-08 | 2012-05-11 | Second Class | AB-10600 | 6acab08bb2b385c8569adfd24730ee01 | Corporate | Phnom Penh | Phnom Penh | Cambodia | NaN | APAC | Southeast Asia | FUR-BO-10000112 | Furniture | Bookcases | Dania Corner Shelving, Pine | 617.1000 | 5 | 0.00 | 172.6500 | 36.380 | Medium | NaN | 3 |
| 3 | 31366 | 2011-06-30 | 2011-07-02 | Second Class | GH-14410 | 1528a0a296f3ecf500753855ea9a21a5 | Home Office | Lima | Lima (city) | Peru | NaN | LATAM | South | FUR-CH-10004338 | Furniture | Chairs | Hon Bag Chairs, Red | 54.1800 | 3 | 0.40 | -32.5200 | 4.919 | Medium | Yes | 2 |
| 4 | 24465 | 2013-06-23 | 2013-06-26 | First Class | KW-16435 | 648a7c6f93ee0f453ee1378466a84ff8 | Consumer | London | England | United Kingdom | NaN | EU | North | OFF-ST-10001646 | Office Supplies | Storage | Fellowes Box, Wire Frame | 50.6250 | 3 | 0.10 | 20.2050 | 8.570 | Medium | NaN | 3 |
| 5 | 30265 | 2013-05-23 | 2013-05-26 | First Class | FC-14245 | c7ee4888116b2f40fd3fa048c57f93c9 | Home Office | Santiago de los Caballeros | Santiago | Dominican Republic | NaN | LATAM | Caribbean | OFF-AP-10001885 | Office Supplies | Appliances | Hamilton Beach Coffee Grinder, White | 43.5200 | 2 | 0.20 | -7.0800 | 5.377 | High | NaN | 3 |
| 6 | 38009 | 2014-10-31 | 2014-11-02 | Second Class | AJ-10780 | 5d7c7e88c8e01ea1ec06adaf52008919 | Corporate | Managua | Managua | Nicaragua | NaN | LATAM | Central | OFF-PA-10000108 | Office Supplies | Paper | Green Bar Parchment Paper, 8.5 x 11 | 27.7600 | 2 | 0.00 | 13.8800 | 2.737 | Medium | NaN | 2 |
| 7 | 40266 | 2013-11-11 | 2013-11-16 | Standard Class | BN-11515 | 583495e45655d0533f2d2b772d823971 | Consumer | Hanoi | Thủ Dô Hà Nội | Vietnam | NaN | APAC | Southeast Asia | OFF-LA-10002992 | Office Supplies | Labels | Novimex Removable Labels, 5000 Label Set | 32.7684 | 4 | 0.17 | -6.3516 | 2.180 | Medium | Yes | 5 |
| 8 | 24871 | 2013-02-27 | 2013-03-04 | Standard Class | CC-12550 | 0515ed679a66bff59a161a28317b6bd4 | Consumer | Broken Hill | New South Wales | Australia | NaN | APAC | Oceania | OFF-ST-10004015 | Office Supplies | Storage | Smead Trays, Blue | 130.8960 | 3 | 0.10 | 56.6460 | 8.300 | Medium | NaN | 5 |
| 9 | 12001 | 2012-06-25 | 2012-06-30 | Second Class | JE-15475 | 45d82b7ca3728955400b9b342ec412dc | Consumer | La Seyne-sur-Mer | Provence-Alpes-Côte d'Azur | France | NaN | EU | Central | TEC-AC-10004883 | Technology | Accessories | Enermax Keyboard, Programmable | 254.8800 | 3 | 0.00 | 109.5300 | 26.590 | Medium | NaN | 5 |
Last rows
| | df_index | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | City | State | Country | Postal Code | Market | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | Shipping Cost | Order Priority | Returned | ship_delay |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 38461 | 47191 | 2014-12-04 | 2014-12-08 | Standard Class | KH-16630 | 2e326a96a0a174fbabf7b2153c86a3c6 | Corporate | Panama City | Panama | Panama | NaN | LATAM | Central | OFF-FA-10001700 | Office Supplies | Fasteners | Accos Rubber Bands, Bulk Pack | 6.67200 | 1 | 0.400 | -0.12800 | 0.862 | High | NaN | 4 |
| 38462 | 21962 | 2014-07-11 | 2014-07-18 | Standard Class | GM-14695 | d4d0b11cd9b34e92c4a1dda4340de9f9 | Corporate | Melbourne | Victoria | Australia | NaN | APAC | Oceania | FUR-FU-10004503 | Furniture | Furnishings | Tenex Photo Frame, Black | 139.96800 | 3 | 0.100 | 43.48800 | 10.590 | Medium | NaN | 7 |
| 38463 | 37194 | 2013-05-25 | 2013-05-31 | Standard Class | GM-4440 | 3d1c57189f23bee3e938826f6556219c | Consumer | Beni Suef | Bani Suwayf | Egypt | NaN | Africa | Africa | OFF-CAR-10002031 | Office Supplies | Binders | Cardinal 3-Hole Punch, Durable | 121.20000 | 4 | 0.000 | 55.68000 | 2.960 | Medium | NaN | 6 |
| 38464 | 16850 | 2012-06-08 | 2012-06-13 | Second Class | JG-15160 | c6c64d1801f997e4e3628ed416cd160e | Consumer | Widnes | England | United Kingdom | NaN | EU | North | OFF-AR-10003384 | Office Supplies | Art | Boston Pens, Water Color | 104.40000 | 6 | 0.000 | 9.36000 | 16.560 | High | NaN | 5 |
| 38465 | 6265 | 2011-05-10 | 2011-05-10 | Same Day | BW-11110 | 2d806890acc865414ad191e4f11ec62a | Corporate | Barcelona | Catalonia | Spain | NaN | EU | South | TEC-MA-10003078 | Technology | Machines | Epson Printer, White | 469.85400 | 2 | 0.100 | -31.32600 | 54.130 | Critical | NaN | 0 |
| 38466 | 11284 | 2014-07-10 | 2014-07-14 | Second Class | PS-18970 | c90d076ff45727789cb1742f443028e1 | Home Office | Petapa | Guatemala | Guatemala | NaN | LATAM | Central | FUR-BO-10001483 | Furniture | Bookcases | Bush Corner Shelving, Metal | 246.90000 | 3 | 0.000 | 32.04000 | 28.644 | Medium | NaN | 4 |
| 38467 | 44732 | 2014-11-26 | 2014-12-02 | Standard Class | CK-12205 | 8fe3138a7ef91d7f8635f63b9d5331ad | Consumer | Panama City | Panama | Panama | NaN | LATAM | Central | OFF-LA-10002015 | Office Supplies | Labels | Hon Round Labels, Alphabetical | 15.55200 | 6 | 0.400 | 1.99200 | 1.281 | Medium | NaN | 6 |
| 38468 | 38158 | 2011-10-14 | 2011-10-18 | Second Class | LR-17035 | a916b8bb7b9fcce602d0808e2eef7979 | Corporate | Agra | Uttar Pradesh | India | NaN | APAC | Central Asia | OFF-LA-10004894 | Office Supplies | Labels | Hon Shipping Labels, Alphabetical | 44.76000 | 4 | 0.000 | 20.04000 | 2.690 | High | NaN | 4 |
| 38469 | 860 | 2012-11-06 | 2012-11-08 | First Class | NW-18400 | 2b29848d9cbad1e31f5cc583c49922cb | Consumer | San Luis Potosí | San Luis Potosí | Mexico | NaN | LATAM | North | TEC-CO-10002009 | Technology | Copiers | Brother Wireless Fax, High-Speed | 1003.34928 | 4 | 0.002 | 178.94928 | 219.533 | Critical | Yes | 2 |
| 38470 | 15795 | 2014-10-28 | 2014-10-28 | Same Day | PV-18985 | c734b7f250b798431a1d83f7b585c499 | Home Office | Franca | São Paulo | Brazil | NaN | LATAM | South | OFF-PA-10002725 | Office Supplies | Paper | Eaton Cards & Envelopes, Premium | 60.24000 | 2 | 0.000 | 18.04000 | 18.306 | Critical | NaN | 0 |
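The bias-corrected Cramér's V mentioned earlier in this report can be sketched in a few lines of Python. This is a minimal illustration of the correction as it is commonly stated (subtracting the expected bias from phi-squared and shrinking the row and column counts), not the report generator's own code. The function name `cramers_v_corrected` and the convention of returning 0.0 for degenerate inputs are assumptions made for this example.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of labels.

    Hypothetical helper for illustration; returns 0.0 when either variable
    has fewer than two distinct values (an assumed convention).
    """
    pairs = list(zip(x, y))
    n = len(pairs)
    row_counts = Counter(a for a, _ in pairs)
    col_counts = Counter(b for _, b in pairs)
    r, k = len(row_counts), len(col_counts)
    if n < 2 or r < 2 or k < 2:
        return 0.0

    # Pearson chi-squared statistic over the full contingency table,
    # including cells with zero observed count.
    observed = Counter(pairs)
    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data (not taken from the dataset): perfectly associated labels.
perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
# Toy data: every combination equally frequent, so no association.
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(round(perfect, 6), independent)
```

With a data frame loaded, the same function could be applied to two categorical columns such as `Category` and `Sub-Category` by passing them as sequences; rows with missing values would need to be dropped first.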